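Diffusion processes that evolve according to linear stochastic differential equations are an important family of continuous-time dynamic decision-making models. Optimal policies for them are well studied when the drift matrices are known with certainty. However, little is known about data-driven control of diffusion processes with uncertain drift matrices, because conventional discrete-time analysis techniques are not applicable. In addition, while the task can be viewed as a reinforcement learning problem involving an exploration-exploitation trade-off, ensuring system stability is a fundamental component of designing optimal policies. We establish that the popular Thompson sampling algorithm learns optimal actions fast, incurring only square-root-of-time regret, and also stabilizes the system in a short time period. To the best of our knowledge, this is the first such result for Thompson sampling in a diffusion process control problem. We validate our theoretical results through empirical simulations with real parameter matrices from two settings: airplane control and blood glucose control. Moreover, we observe that Thompson sampling significantly improves (worst-case) regret compared to state-of-the-art algorithms, suggesting that Thompson sampling explores in a more guarded fashion. Our theoretical analysis involves the characterization of a certain optimality manifold that ties the local geometry of the drift parameters to the optimal control of the diffusion process. We expect this technique to be of broader interest.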
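In this note, we introduce a general version of the well-known elliptical potential lemma, a technique widely used in the analysis of algorithms for sequential learning and decision-making problems. We consider a stochastic linear bandit setting in which a decision-maker sequentially chooses among a set of given actions, observes their noisy rewards, and aims to maximize her cumulative expected reward over a decision-making horizon. The elliptical potential lemma is a key tool for quantifying the uncertainty in estimating the parameters of the reward function, but it requires the noise and prior distributions to be Gaussian. Our general elliptical potential lemma relaxes this Gaussian requirement, which is a highly non-trivial extension for several reasons: unlike the Gaussian case, there is no closed-form solution for the covariance matrix of the posterior distribution, the covariance matrix is not a deterministic function of the actions, and the covariance matrix is not decreasing with respect to the semidefinite inequality. While this result is of broad interest, we showcase an application of it by proving an improved Bayesian regret bound for the well-known Thompson sampling algorithm in stochastic linear bandits with changing action sets, where the prior and noise distributions are general. This bound is minimax optimal up to constants.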
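For orientation, the classical form of the elliptical potential lemma, as commonly stated in the linear bandit literature, can be written as follows. This is the standard deterministic statement, not the generalized lemma of the note; the symbols $x_t$, $V_t$, $\lambda$, $L$, $d$, and $T$ are notation assumed here for illustration.

```latex
% Classical elliptical potential lemma (standard form; notation assumed).
% Let x_1, ..., x_T \in \mathbb{R}^d with \|x_t\|_2 \le L, let \lambda > 0, and define
\[
  V_0 = \lambda I, \qquad V_t = V_{t-1} + x_t x_t^\top .
\]
% Then
\[
  \sum_{t=1}^{T} \min\!\left(1,\; \|x_t\|_{V_{t-1}^{-1}}^{2}\right)
  \;\le\; 2 \log \frac{\det V_T}{\det V_0}
  \;\le\; 2 d \log\!\left(1 + \frac{T L^{2}}{d \lambda}\right),
\]
% where \|x\|_{M}^{2} = x^\top M x.
```

In the Gaussian setting, $V_t^{-1}$ is proportional to the posterior covariance and depends only on the chosen actions, which is the property the abstract says is lost for general priors and noise.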
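In this paper, we study the problem of designing experiments that are conducted on a set of units, such as users or groups of users in an online marketplace, over multiple time periods, such as weeks or months. These experiments are particularly useful for studying treatments that have causal effects on both current and future outcomes (instantaneous and lagged effects). The design problem involves selecting a treatment time for each unit, before or during the experiment, in order to estimate the instantaneous and lagged effects as precisely as possible after the experiment. Optimizing these treatment decisions can directly minimize the opportunity cost of the experiment by reducing its sample size requirement. The optimization is an NP-hard integer program, for which we provide a near-optimal solution when the design decisions are all made at the beginning (fixed-sample-size designs). Next, we study sequential experiments that allow adaptive decisions during the experiment and can also stop the experiment early, further reducing its cost. However, the sequential nature of these experiments complicates both the design phase and the estimation phase. We propose a new algorithm, PGAE, that addresses these challenges by adaptively making treatment decisions, estimating the treatment effects, and drawing valid post-experiment inference. PGAE combines ideas from Bayesian statistics, dynamic programming, and sample splitting. Using synthetic experiments on real data sets from multiple domains, we demonstrate that, compared to benchmarks, our proposed solutions for fixed-sample-size and sequential experiments reduce the opportunity cost of the experiments by 50% and 70%, respectively.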
In this paper, we study the trace regression when a matrix of parameters B* is estimated via the convex relaxation of a rank-regularized regression or via regularized non-convex optimization. It is known that these estimators satisfy near-optimal error bounds under assumptions on the rank, coherence, and spikiness of B*. We start by introducing a general notion of spikiness for B* that provides a generic recipe to prove the restricted strong convexity of the sampling operator of the trace regression and obtain near-optimal and non-asymptotic error bounds for the estimation error. Similar to the existing literature, these results require the regularization parameter to be above a certain theory-inspired threshold that depends on observation noise that may be unknown in practice. Next, we extend the error bounds to cases where the regularization parameter is chosen via cross-validation. This result is significant in that existing theoretical results on cross-validated estimators (Kale et al., 2011; Kumar et al., 2013; Abou-Moustafa and Szepesvari, 2017) do not apply to our setting since the estimators we study are not known to satisfy their required notion of stability. Finally, using simulations on synthetic and real data, we show that the cross-validated estimator selects a near-optimal penalty parameter and outperforms the theory-inspired approach of selecting the parameter.
Code generation from text requires understanding the user's intent from a natural language description (NLD) and generating an executable program code snippet that satisfies this intent. While recent pretrained language models (PLMs) demonstrate remarkable performance for this task, these models fail when the given NLD is ambiguous due to the lack of enough specifications for generating a high-quality code snippet. In this work, we introduce a novel and more realistic setup for this task. We hypothesize that ambiguities in the specifications of an NLD are resolved by asking clarification questions (CQs). Therefore, we collect and introduce a new dataset named CodeClarQA containing NLD-Code pairs with created CQAs. We evaluate the performance of PLMs for code generation on our dataset. The empirical results support our hypothesis that clarifications result in more precise generated code, as shown by an improvement of 17.52 in BLEU, 12.72 in CodeBLEU, and 7.7% in the exact match. Alongside this, our task and dataset introduce new challenges to the community, including when and what CQs should be asked.
In data-driven systems, data exploration is imperative for making real-time decisions. However, big data is stored in massive databases from which retrieval is difficult. Approximate Query Processing (AQP) is a technique for providing approximate answers to aggregate queries based on a summary of the data (synopsis) that closely replicates the behavior of the actual data. It is useful where an approximate answer, obtained in a fraction of the real execution time, would be acceptable. In this paper, we discuss the use of Generative Adversarial Networks (GANs) for generating tabular data that can be employed in AQP for synopsis construction. We first discuss the challenges associated with constructing synopses in relational databases and then introduce solutions to those challenges. Following that, we organize statistical metrics to evaluate the quality of the generated synopses. We conclude that tabular data complexity makes it difficult for algorithms to understand relational database semantics during training, and that improved versions of tabular GANs are capable of constructing synopses to revolutionize data-driven decision-making systems.
Hawkes processes have recently risen to the forefront of tools for modeling and generating sequential event data. Multidimensional Hawkes processes model both the self- and cross-excitation between different types of events and have been applied successfully in various domains such as finance, epidemiology, and personalized recommendations, among others. In this work, we present an adaptation of the Frank-Wolfe algorithm for learning multidimensional Hawkes processes. Experimental results show that our approach achieves better or on-par accuracy in parameter estimation compared with other first-order methods, while enjoying a significantly faster runtime.
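The abstract does not give the details of the adaptation, so the following is only a minimal sketch of the generic Frank-Wolfe iteration on a toy problem, not the authors' Hawkes-specific method. The objective (a quadratic over the probability simplex), the function name `frank_wolfe_simplex`, and the step size 2/(k+2) are assumptions chosen for illustration.

```python
from typing import Callable, List


def frank_wolfe_simplex(
    grad: Callable[[List[float]], List[float]],
    dim: int,
    num_iters: int = 2000,
) -> List[float]:
    """Generic Frank-Wolfe over the probability simplex (illustrative only)."""
    # Start from the uniform distribution, which is a feasible point.
    x = [1.0 / dim] * dim
    for k in range(num_iters):
        g = grad(x)
        # Linear minimization oracle: over the simplex, the minimizer of
        # <g, s> is the vertex whose gradient coordinate is smallest.
        s = min(range(dim), key=lambda i: g[i])
        # Classical diminishing step size.
        gamma = 2.0 / (k + 2.0)
        # Convex combination of x and the vertex keeps the iterate feasible,
        # so no projection step is needed.
        x = [(1.0 - gamma) * xi for xi in x]
        x[s] += gamma
    return x


# Toy objective (assumed): f(x) = 0.5 * ||x - target||^2, with target in the simplex.
target = [0.2, 0.5, 0.3]


def toy_grad(x: List[float]) -> List[float]:
    return [xi - ti for xi, ti in zip(x, target)]


estimate = frank_wolfe_simplex(toy_grad, dim=3)
```

The appeal of this family of methods is that each iteration needs only a gradient and a linear minimization over the feasible set, which is often cheaper than a projection.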
Graph neural networks have been shown to learn effective node representations, enabling node-, link-, and graph-level inference. Conventional graph networks assume static relations between nodes, while relations between entities in a video often evolve over time, with nodes entering and exiting dynamically. In such temporally-dynamic graphs, a core problem is inferring the future state of spatio-temporal edges, which can constitute multiple types of relations. To address this problem, we propose MTD-GNN, a graph network for predicting temporally-dynamic edges for multiple types of relations. We propose a factorized spatio-temporal graph attention layer to learn dynamic node representations and present a multi-task edge prediction loss that models multiple relations simultaneously. The proposed architecture operates on top of scene graphs that we obtain from videos through object detection and spatio-temporal linking. Experimental evaluations on ActionGenome and CLEVRER show that modeling multiple relations in our temporally-dynamic graph network can be mutually beneficial, outperforming existing static and spatio-temporal graph neural networks, as well as state-of-the-art predicate classification methods.
The Longest Common Subsequence (LCS) problem is to find the longest subsequence that is common to all strings in a given set. The LCS has applications in computational biology and text editing, among many others. Because the general LCS problem is NP-hard, numerous heuristic algorithms and solvers have been proposed to give the best possible solution for different sets of strings. None of them has the best performance for all types of sets. In addition, there is no method to specify the type of a given set of strings. Moreover, the available hyper-heuristic is not efficient or fast enough to solve this problem in real-world applications. This paper proposes a novel hyper-heuristic to solve the longest common subsequence problem using a novel criterion to classify a set of strings based on their similarity. To do this, we offer a general stochastic framework to identify the type of a given set of strings. Following that, we introduce the set similarity dichotomizer ($S^2D$) algorithm, based on the framework, that divides sets into two types. This algorithm is introduced for the first time in this paper and opens a new way to go beyond the current LCS solvers. Then, we present a novel hyper-heuristic that exploits the $S^2D$ and one of the internal properties of the set to choose the best matching heuristic among a set of heuristics. We compare the results on benchmark datasets with the best heuristics and hyper-heuristics. The results show that our proposed hyper-heuristic achieves higher performance in both solution quality and run time.
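To make the problem concrete, here is a minimal sketch of the textbook dynamic program for the special case of two strings, which is solvable in polynomial time. It is not the hyper-heuristic or the $S^2D$ algorithm described in the abstract; the function names `lcs` and `is_subsequence` are hypothetical and chosen for illustration.

```python
def lcs(a: str, b: str) -> str:
    """Return one longest common subsequence of two strings (textbook DP)."""
    m, n = len(a), len(b)
    # dp[i][j] holds the LCS length of the prefixes a[:i] and b[:j].
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])

    # Walk back from dp[m][n] to recover one optimal subsequence.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


def is_subsequence(s: str, t: str) -> bool:
    """Check whether s can be obtained from t by deleting characters."""
    it = iter(t)
    return all(ch in it for ch in s)


result = lcs("AGGTAB", "GXTXAYB")
```

For a set of many strings, the table of this dynamic program gains one dimension per string, so its size grows exponentially with the number of strings. This is why heuristics are used for the general problem.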
Continuous behavioural authentication methods add a unique layer of security by allowing individuals to verify their unique identity when accessing a device. Maintaining session authenticity is now feasible by monitoring users' behaviour while they interact with a mobile or Internet of Things (IoT) device, making credential theft and session hijacking ineffective. Such a technique is made possible by integrating the power of artificial intelligence and Machine Learning (ML). Most of the literature focuses on training machine learning models for each user by transmitting their data to an external server, which exposes private user data to threats. In this paper, we propose a novel Federated Learning (FL) approach that protects the anonymity of user data and maintains its security. We present a warmup approach that provides a significant accuracy increase. In addition, we leverage a transfer learning technique based on feature extraction to boost the models' performance. Our extensive experiments on four datasets (MNIST, FEMNIST, CIFAR-10 and UMDAA-02-FD) show a significant increase in user authentication accuracy while maintaining user privacy and data security.